65 research outputs found

    Human Pose Estimation from Silhouettes. A Consistent Approach Using Distance Level Sets

    Combining Implicit Function Learning and Parametric Models for 3D Human Reconstruction

    Implicit functions represented as deep learning approximations are powerful for reconstructing 3D surfaces. However, they can only produce static surfaces that are not controllable, which provides limited ability to modify the resulting model by editing its pose or shape parameters. Nevertheless, such features are essential in building flexible models for both computer graphics and computer vision. In this work, we present a methodology that combines detail-rich implicit functions and parametric representations in order to reconstruct 3D models of people that remain controllable and accurate even in the presence of clothing. Given sparse 3D point clouds sampled on the surface of a dressed person, we use an Implicit Part Network (IP-Net) to jointly predict the outer 3D surface of the dressed person, the inner body surface, and the semantic correspondences to a parametric body model. We subsequently use correspondences to fit the body model to our inner surface and then non-rigidly deform it (under a parametric body + displacement model) to the outer surface in order to capture garment, face and hair detail. In quantitative and qualitative experiments with both full body data and hand scans we show that the proposed methodology generalizes, and is effective even given incomplete point clouds collected from single-view depth images. Our models and code can be downloaded from http://virtualhumans.mpi-inf.mpg.de/ipnet

    LoopReg: Self-supervised Learning of Implicit Surface Correspondences, Pose and Shape for 3D Human Mesh Registration

    We address the problem of fitting 3D human models to 3D scans of dressed humans. Classical methods optimize both the data-to-model correspondences and the human model parameters (pose and shape), but are reliable only when initialized close to the solution. Some methods initialize the optimization based on fully supervised correspondence predictors, which is not differentiable end-to-end, and can only process a single scan at a time. Our main contribution is LoopReg, an end-to-end learning framework to register a corpus of scans to a common 3D human model. The key idea is to create a self-supervised loop. A backward map, parameterized by a Neural Network, predicts the correspondence from every scan point to the surface of the human model. A forward map, parameterized by a human model, transforms the corresponding points back to the scan based on the model parameters (pose and shape), thus closing the loop. Formulating this closed loop is not straightforward because it is not trivial to force the output of the NN to be on the surface of the human model - outside this surface the human model is not even defined. To this end, we propose two key innovations. First, we define the canonical surface implicitly as the zero level set of a distance field in R^3, which, in contrast to more common UV parameterizations, does not require cutting the surface, does not have discontinuities, and does not induce distortion. Second, we diffuse the human model to the 3D domain R^3. This allows us to map the NN predictions forward, even when they slightly deviate from the zero level set. Results demonstrate that we can train LoopReg mainly self-supervised - following a supervised warm-start, the model becomes increasingly more accurate as additional unlabelled raw scans are processed. Our code and pre-trained models can be downloaded for research
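
    To make the phrase "zero level set of a distance field" concrete, here is a minimal sketch that is not taken from the LoopReg code. It assumes a unit sphere as a stand-in for the canonical body surface, and the function names are hypothetical. A point that lies slightly off the surface is moved back onto the zero level set by stepping along the distance gradient.

```python
import math

def sphere_distance(p, radius=1.0):
    """Signed distance from point p to a sphere centred at the origin.

    The surface itself is the zero level set: all p with distance 0.
    """
    return math.sqrt(sum(c * c for c in p)) - radius

def sphere_gradient(p):
    """Gradient of the distance field (unit vector pointing away from the centre).

    Undefined at the origin, which this toy example does not handle.
    """
    norm = math.sqrt(sum(c * c for c in p))
    return [c / norm for c in p]

def project_to_zero_level_set(p, radius=1.0):
    """Move p onto the surface by stepping back along the gradient by d(p)."""
    d = sphere_distance(p, radius)
    g = sphere_gradient(p)
    return [c - d * gc for c, gc in zip(p, g)]

# A prediction that deviates slightly from the surface.
off_surface = [0.9, 0.3, -0.6]
on_surface = project_to_zero_level_set(off_surface)
print(sphere_distance(off_surface), sphere_distance(on_surface))
```

    The abstract describes a different remedy for off-surface predictions (diffusing the human model to all of R^3), so the projection step above only illustrates the implicit surface representation, not the method itself.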

    Transformer-Based Learned Optimization

    We propose a new approach to learned optimization where we represent the computation of an optimizer's update step using a neural network. The parameters of the optimizer are then learned by training on a set of optimization tasks with the objective to perform minimization efficiently. Our innovation is a new neural network architecture, Optimus, for the learned optimizer inspired by the classic BFGS algorithm. As in BFGS, we estimate a preconditioning matrix as a sum of rank-one updates but use a Transformer-based neural network to predict these updates jointly with the step length and direction. In contrast to several recent learned optimization-based approaches, our formulation allows for conditioning across the dimensions of the parameter space of the target problem while remaining applicable to optimization tasks of variable dimensionality without retraining. We demonstrate the advantages of our approach on a benchmark composed of objective functions traditionally used for the evaluation of optimization algorithms, as well as on the real-world task of physics-based visual reconstruction of articulated 3D human motion. Comment: Accepted to the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2023 (CVPR) in Vancouver, Canada
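
    For readers unfamiliar with the classic algorithm the abstract refers to, the following is a minimal sketch of the textbook BFGS update of the inverse-Hessian estimate, written as a sum of rank-one terms. It is not the Optimus architecture, which predicts its updates with a Transformer; the helper names are illustrative, and the estimate `h` is assumed to be symmetric.

```python
def dot(u, v):
    """Inner product of two vectors given as lists."""
    return sum(ui * vi for ui, vi in zip(u, v))

def matvec(m, v):
    """Matrix-vector product for a list-of-rows matrix."""
    return [dot(row, v) for row in m]

def outer(u, v):
    """Rank-one matrix u v^T."""
    return [[ui * vj for vj in v] for ui in u]

def bfgs_inverse_update(h, s, y):
    """Textbook BFGS update of a symmetric inverse-Hessian estimate h.

    s is the parameter step x_new - x_old and y is the gradient
    difference g_new - g_old. The result equals
    (I - rho s y^T) h (I - rho y s^T) + rho s s^T with rho = 1 / (y^T s),
    expanded here into a sum of rank-one matrices.
    """
    ys = dot(y, s)
    if ys <= 1e-12:
        return h  # curvature condition violated: keep the old estimate
    rho = 1.0 / ys
    hy = matvec(h, y)
    coeff = rho * rho * dot(y, hy) + rho
    s_hy, hy_s, s_s = outer(s, hy), outer(hy, s), outer(s, s)
    n = len(s)
    return [
        [h[i][j] - rho * (s_hy[i][j] + hy_s[i][j]) + coeff * s_s[i][j]
         for j in range(n)]
        for i in range(n)
    ]

# Start from the identity and apply one update.
identity = [[1.0, 0.0], [0.0, 1.0]]
step = [1.0, 0.5]
grad_diff = [2.0, 0.25]
updated = bfgs_inverse_update(identity, step, grad_diff)
print(matvec(updated, grad_diff))  # matches `step` up to rounding (secant condition)
```

    The updated estimate satisfies the secant condition (applying it to the gradient difference returns the step), which is the property that makes it a useful preconditioner in the classic method.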

    Canonical Skeletons for Shape Matching

    BEHAVE: Dataset and Method for Tracking Human Object Interactions

    Learning 3D Human Pose from Structure and Motion

    3D human pose estimation from a single image is a challenging problem, especially for in-the-wild settings due to the lack of 3D annotated data. We propose two anatomically inspired loss functions and use them with a weakly-supervised learning framework to jointly learn from large-scale in-the-wild 2D and indoor/synthetic 3D data. We also present a simple temporal network that exploits temporal and structural cues present in predicted pose sequences to temporally harmonize the pose estimations. We carefully analyze the proposed contributions through loss surface visualizations and sensitivity analysis to facilitate deeper understanding of their working mechanism. Our complete pipeline improves the state-of-the-art by 11.8% and 12% on Human3.6M and MPI-INF-3DHP, respectively, and runs at 30 FPS on a commodity graphics card. Comment: ECCV 2018. Project page: https://www.cse.iitb.ac.in/~rdabral/3DPose

    The Alignment Between 3-D Data and Articulated Shapes with Bending Surfaces

    In this paper we address the problem of aligning 3-D data with articulated shapes. This problem resides at the core of many motion tracking methods with applications in human motion capture, action recognition, medical-image analysis, etc. We describe an articulated and bending surface representation well suited for this task as well as a method which aligns (or registers) such a surface to 3-D data. Articulated objects, e.g., humans and animals, are covered with clothes and skin which may be seen as textured surfaces. These surfaces are both articulated and deformable and one realistic way to model them is to assume that they bend in the neighborhood of the shape's joints. We will introduce a surface-bending model as a function of the articulated-motion parameters. This combined articulated-motion and surface-bending model better predicts the observed phenomena in the data and therefore is well suited for surface registration. Given a set of sparse 3-D data (gathered with a stereo camera pair) and a textured, articulated, and bending surface, we describe a register-and-fit method that proceeds as follows. First, the data-to-surface registration problem is formalized as a classifier and is carried out using an EM algorithm. Second, the data-to-surface fitting problem is carried out by minimizing the distance from the registered data points to the surface over the joint variables. In order to illustrate the method we applied it to the problem of hand tracking. A hand model with 27 degrees of freedom is successfully registered and fitted to a sequence of 3-D data points gathered with a stereo camera pair.

    MC EMiNEM Maps the Interaction Landscape of the Mediator

    The Mediator is a highly conserved, large multiprotein complex that is involved essentially in the regulation of eukaryotic mRNA transcription. It acts as a general transcription factor by integrating regulatory signals from gene-specific activators or repressors to the RNA Polymerase II. The internal network of interactions between Mediator subunits that conveys these signals is largely unknown. Here, we introduce MC EMiNEM, a novel method for the retrieval of functional dependencies between proteins that have pleiotropic effects on mRNA transcription. MC EMiNEM is based on Nested Effects Models (NEMs), a class of probabilistic graphical models that extends the idea of hierarchical clustering. It combines mode-hopping Monte Carlo (MC) sampling with an Expectation-Maximization (EM) algorithm for NEMs to increase sensitivity compared to existing methods. A meta-analysis of four Mediator perturbation studies in Saccharomyces cerevisiae, three of which are unpublished, provides new insight into the Mediator signaling network. In addition to the known modular organization of the Mediator subunits, MC EMiNEM reveals a hierarchical ordering of its internal information flow, which is putatively transmitted through structural changes within the complex. We identify the N-terminus of Med7 as a peripheral entity, entailing only local structural changes upon perturbation, while the C-terminus of Med7 and Med19 appear to play a central role. MC EMiNEM associates Mediator subunits to most directly affected genes, which, in conjunction with gene set enrichment analysis, allows us to construct an interaction map of Mediator subunits and transcription factors